Algorithms and performance of load-balancing with multiple hash functions in massive content distribution
Authors
Abstract
We consider a two-tier content distribution system for distributing massive content, consisting of an infrastructure content distribution network (CDN) and a large number of ordinary clients. The nodes of the infrastructure network form a structured peer-to-peer (P2P) network based on a distributed hash table (DHT). Each file is first placed in the CDN and, depending on its popularity, may be replicated among the infrastructure nodes. In such a system, proper load-balancing mechanisms are particularly pressing to relieve server or network overload. This paper studies popularity-based file replication techniques within the CDN using multiple hash functions. Our strategy is to set aside a large number of hash functions. When the demand for a file exceeds the overall capacity of the current servers, a previously unused hash function is used to obtain a new node ID where the file will be replicated. The central problems are how to choose an unused hash function when replicating a file and how to choose a used hash function when requesting the file. Our solution to the file replication problem is to choose the unused hash function with the smallest index, and our solution to the file request problem is to choose a used hash function uniformly at random. Our main contribution is a set of distributed, robust algorithms that implement these solutions, together with an evaluation of their performance. In particular, we have analyzed a random binary search algorithm for file request and a random gap removal algorithm for failure recovery. © 2008 Elsevier B.V. All rights reserved.
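The two selection rules stated in the abstract (replicate with the smallest-index unused hash function, request with a used hash function chosen uniformly at random) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the names `node_id` and `ReplicaDirectory`, the use of SHA-256 salted with the index as the "i-th hash function", the 32-bit ID space, and above all the centralized bookkeeping, which stands in for the distributed algorithms (random binary search, random gap removal) that the paper actually develops.

```python
import hashlib
import random

RING_BITS = 32  # assumed size of the DHT ID space, for illustration only


def node_id(file_name: str, index: int) -> int:
    """Hypothetical i-th hash function: SHA-256 over the index and the file name."""
    digest = hashlib.sha256(f"{index}:{file_name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << RING_BITS)


class ReplicaDirectory:
    """Centralized stand-in for state that the paper maintains in a distributed way."""

    def __init__(self, max_hash_functions: int = 64) -> None:
        self.max_hash_functions = max_hash_functions
        self.used = {}  # file name -> set of hash-function indices in use

    def replicate(self, file_name: str) -> int:
        """Pick the unused hash function with the smallest index; return the new node ID."""
        used = self.used.setdefault(file_name, set())
        for index in range(self.max_hash_functions):
            if index not in used:
                used.add(index)
                return node_id(file_name, index)
        raise RuntimeError("no unused hash function left for " + file_name)

    def request(self, file_name: str, rng: random.Random) -> int:
        """Pick a used hash function uniformly at random; return the node ID to contact."""
        used = self.used.get(file_name)
        if not used:
            raise KeyError(file_name)
        index = rng.choice(sorted(used))
        return node_id(file_name, index)
```

For example, calling `replicate("movie.bin")` three times on a fresh directory uses indices 0, 1 and 2 in that order, and a subsequent `request("movie.bin", random.Random())` returns one of those three node IDs with equal probability. If index 1 is later dropped (say, after a node failure), the next `replicate` call reuses index 1, which keeps the set of used indices compact in this simplified model.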
Similar resources
Comparison of Load Balancing Algorithms for Structured Peer-to-Peer Systems
Among other things, Peer-to-Peer (P2P) systems are very useful for managing large amounts of widely distributed data. Distributed Hash Tables (DHTs) offer a highly scalable and self-organizing approach for efficient and persistent distribution and retrieval of data. However, the scalability and performance of DHTs depend strongly on an equal distribution of data across participating nodes. Beca...
Internet Traffic Distribution over Multilink Where High Bandwidth Scalable Switch Port Aggregates Multiple Physical Links
A logical link composed of multiple physical links is in extensive use in today’s Internet, and its use is growing due to good scalability, reliability and cost-effectiveness. When IP packets are distributed over such physical links, load imbalance and packet reordering may occur. Since packet reordering degrades TCP performance, a good traffic distribution method must reduce the amount of reo...
Finding the Optimal Path to Restoration Loads of Power Distribution Network by Hybrid GA-BCO Algorithms Under Fault and Fuzzy Objective Functions with Load Variations
This paper proposes a fuzzy multi-objective hybrid Genetic and Bee Colony Optimization algorithm (GA-BCO) to find the optimal restoration of loads of a power distribution network under fault. Restoration of distribution systems is a complex combinatorial optimization problem that must be solved efficiently and in reasonable time. To improve the efficiency of restoration and facilitate the activity...
An Improved Technique Of Extracting Frequent Itemsets From Massive Data Using MapReduce
The mining of frequent itemsets is a basic and essential task in many data mining applications. Extracting frequent itemsets, together with frequent patterns and rules, supports applications such as association rule mining and correlation analysis, as well as product sales and marketing. A number of algorithms are used to extract frequent itemsets, such as FP-growth and Eclat. But unfortunately these algorit...
A Dynamic Popularity-Aware Load Balancing Algorithm for Structured P2P Systems
Load balancing is one of the main challenges of structured P2P systems that use distributed hash tables (DHT) to map data items (objects) onto the nodes of the system. In a typical P2P system with N nodes, the use of random hash functions for distributing keys among peer nodes can lead to O(log N) imbalance. Most existing load balancing algorithms for structured P2P systems are not proximity-awa...
Journal title:
- Computer Networks
Volume 53
Publication date: 2009